This week I dipped a toe, so to speak, into the stage 2 processing our lab does in the project. I compiled a glossary and a brief explanation of how neural networks work, for my own and others' reference. I ran experiments showing that, for our setup, smaller batch sizes give the best accuracy. I attempted to implement a new network on our system but did not finish. I attended a robotics conference at Yale; a relevant paper is linked.
Today I experimented with the second stage of our perception pipeline: this stage uses the depth measurements in our RGB-D images together with the bounding boxes produced by the first stage (the neural networks).
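To make the idea concrete, here is a minimal sketch of how depth and a stage-1 box could be combined; this is my own simplified illustration, not our actual code, and the crop-and-median approach and camera intrinsics are assumptions:

```python
import numpy as np

def estimate_object_position(depth_image, bbox, fx, fy, cx, cy):
    """Estimate an object's 3D position from a depth image and a
    stage-1 bounding box (x_min, y_min, x_max, y_max) in pixels.

    depth_image: HxW array of depths in meters (the RGB-D depth channel)
    fx, fy, cx, cy: assumed pinhole camera intrinsics
    """
    x_min, y_min, x_max, y_max = bbox
    crop = depth_image[y_min:y_max, x_min:x_max]

    # Median is more robust than mean against missing (zero) depth pixels.
    z = np.median(crop[crop > 0])

    # Back-project the box center through the pinhole camera model.
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```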
Today I realized that my ignorance of the topics, phrases, and words my coworkers used and the papers I read was not sustainable. I spent the entire day reading up on every word or phrase I did not understand. One site took up my whole day, and it was entirely worth it: I now have a working understanding of neural networks, where before I had nearly none.
At the same time, I experimented with how batch size affects accuracy. Before today, I had initially trained networks at bs = 10, which seemed fine to me, but I checked the paper the lab published recently and it claimed accuracies 20-40% higher than mine. I checked the files used for the paper and found they had used bs = 1; I trained networks with that setting and achieved accuracies equal to or better than the paper's. Today I trained AlexNet with bs = 92, the largest the GPU could manage (increasing the batch size linearly increases GPU RAM usage); its accuracy was 50% below what I was getting at bs = 10 and 70% below the paper's. Bizarre.
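For reference, batch size in PyTorch is just the `batch_size` argument to the `DataLoader`; the dataset path and preprocessing below are placeholders, not our actual setup:

```python
import torch
from torchvision import datasets, models, transforms

# Placeholder dataset location and preprocessing; ours differs.
data = datasets.ImageFolder(
    "data/train",
    transform=transforms.Compose([
        transforms.Resize((224, 224)),  # AlexNet's expected input size
        transforms.ToTensor(),
    ]),
)

# batch_size is the knob under test; GPU RAM usage grows roughly
# linearly with it, which is why 92 was the ceiling on our card.
loader = torch.utils.data.DataLoader(data, batch_size=10, shuffle=True)

model = models.alexnet(pretrained=True).cuda()
```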
Today I put together an intro paper and glossary for new lab members, using the page from the previous day as a major reference. It should greatly help the on-boarding process for whoever comes after me, and I am sure I will use it for my own reference throughout my time here.
At the same time, I trained eight new AlexNet models, differentiated by batch size (1, 2, 5, 10, 15, 25, 50, 75), to compare against the bs = 92 results. In addition to accuracy, I also recorded the time taken to run ten epochs; I had noticed a difference in training time and, while I doubted it would vary much, I was curious.
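The sweep itself was simple in outline; a rough sketch, reusing the `data` dataset from the earlier snippet, where `make_fresh_alexnet`, `train_one_epoch`, and `evaluate` are hypothetical stand-ins for our training code:

```python
import time
import torch

batch_sizes = [1, 2, 5, 10, 15, 25, 50, 75, 92]
results = {}

for bs in batch_sizes:
    loader = torch.utils.data.DataLoader(data, batch_size=bs, shuffle=True)
    model = make_fresh_alexnet()  # hypothetical: fresh model per run

    # Time ten epochs of training at this batch size.
    start = time.perf_counter()
    for epoch in range(10):
        train_one_epoch(model, loader)  # hypothetical helper
    elapsed = time.perf_counter() - start

    results[bs] = (evaluate(model), elapsed)  # hypothetical helper
```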
As it turned out, the models' accuracies dropped sharply as batch size increased. The standard deviation rose somewhat but plateaued. Time to train ten epochs dropped by about six minutes by bs = 5 but stayed roughly level after that. I had expected training time to decrease at least somewhat linearly, but it instead appeared to fall off logarithmically toward <25 min, presumably because once the GPU is saturated, raising the batch size no longer improves per-image throughput. Besides the timing, the results were as expected.
Afterward, I attended a Skype call in which the director, my mentor, and I asked questions of the person who had last been working on what is now my mentor's part of the project.
Today I began rereading the papers I had not understood before, since by now I should be able to follow them.
Afterward, a coworker and I added background images to the training set, in the hope of helping the networks better reject false positives. I wrote a script that finds every image in the data set lacking the metadata file needed for ground truth (none of the freshly made background images, which were cropped out of the existing photos, had one) and automatically generates that file. Once all the background images were added, we attempted training on the full data set. It did not work; an error kept coming up, so we decided to roll the data set back to the version without background images and return to this task another time.
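A sketch of the kind of script I mean; the directory layout and the "empty file as empty ground truth" convention are assumptions, since our actual metadata format differs:

```python
import os

IMAGE_DIR = "dataset/images"        # assumed layout
LABEL_DIR = "dataset/annotations"   # assumed layout

for name in os.listdir(IMAGE_DIR):
    stem, ext = os.path.splitext(name)
    if ext.lower() not in (".jpg", ".png"):
        continue
    label_path = os.path.join(LABEL_DIR, stem + ".txt")
    if not os.path.exists(label_path):
        # Background crops contain no objects, so an empty ground-truth
        # file (zero bounding boxes) is the right metadata for them.
        open(label_path, "w").close()
```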
Afterward, I attempted to get SqueezeNet working on our system. I ran into many issues that made no sense to me; it seems very few people have attempted this with the libraries we use, or few have had problems, since the sparse 'help' threads I found were nowhere close to what I needed to do and none were directly related to my issues. The director decided I should install a separate library (Caffe), since it might be usable and let me circumvent my problems with our current library (PyTorch). I did not get far in the installation process; I will pick it up on Monday.
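For what it's worth, loading SqueezeNet itself via torchvision is the easy part; my problems were in fitting it into the rest of our pipeline. A minimal sketch, where the class count is a placeholder rather than our real one:

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # placeholder; not our actual class count

model = models.squeezenet1_0(pretrained=True)

# SqueezeNet classifies with a 1x1 convolution rather than a linear
# layer, so adapting it to a new class count means swapping that conv.
model.classifier[1] = nn.Conv2d(512, NUM_CLASSES, kernel_size=1)
model.num_classes = NUM_CLASSES
```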
Today I did not go into the office; the lab instead attended a conference at Yale. The conference (NEMS 2018) was focused largely on robotics, which is outside our interests, but one presentation in particular, on perception in warehouse tasks (paper here), was relevant.